deep learning inference
Accelerated Probabilistic Marching Cubes by Deep Learning for Time-Varying Scalar Ensembles
Han, Mengjiao, Athawale, Tushar M., Pugmire, David, Johnson, Chris R.
Visualizing the uncertainty of ensemble simulations is challenging due to the large size and multivariate and temporal features of ensemble data sets. One popular approach to studying the uncertainty of ensembles is analyzing the positional uncertainty of the level sets. Probabilistic marching cubes is a technique that performs Monte Carlo sampling of multivariate Gaussian noise distributions for positional uncertainty visualization of level sets. However, the technique suffers from high computational cost, making interactive visualization and analysis impossible to achieve. This paper introduces a deep-learning-based approach to learning the level-set uncertainty for two-dimensional ensemble data with a multivariate Gaussian noise assumption. We train the model using the first few time steps from time-varying ensemble data in our workflow. We demonstrate that our trained model accurately infers uncertainty in level sets for new time steps and is up to 170X faster than the original probabilistic model computed serially and 10X faster than its parallel implementation.
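The Monte Carlo step the paper accelerates can be sketched in a few lines. The toy below (my own illustration, not the authors' code) estimates, for a single 2D grid cell with a multivariate Gaussian noise model over its four corner values, the probability that a level set crosses the cell: a sample "crosses" when its corner values straddle the isovalue. The cell mean, covariance, and isovalue are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossing_probability(mean, cov, isovalue, n_samples=2000):
    """Monte Carlo estimate of the probability that a level set crosses
    a 2D cell, given a multivariate Gaussian model of its corner values.

    mean: length-4 mean vector of the scalar field at the cell corners
    cov:  4x4 covariance matrix (multivariate Gaussian noise assumption)
    """
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    above = samples > isovalue
    # A sample crosses the cell iff at least one corner is above the
    # isovalue and at least one is below it.
    crosses = above.any(axis=1) & (~above).any(axis=1)
    return crosses.mean()

# Toy cell whose mean corner values straddle the isovalue 0.5
mean = np.array([0.2, 0.4, 0.6, 0.8])
cov = 0.05 * np.eye(4)
p = crossing_probability(mean, cov, isovalue=0.5)
```

Running this over every cell of every ensemble time step is what makes the original technique slow; the paper's contribution is a network that predicts these per-cell probabilities directly.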
How to Accelerate TensorFlow on Intel Hardware
When deploying deep learning models, inference speed is usually measured in terms of latency or throughput, depending on your application's requirements. Latency is how quickly you can get an answer, whereas throughput is how much data the model can process in a given amount of time. Both use cases benefit from accelerating the inference operations of the deep learning framework running on the target hardware. Engineers from Intel and Google have collaborated to optimize TensorFlow* running on Intel hardware. This work is part of the Intel oneAPI Deep Neural Network Library (oneDNN) and is available as part of standard TensorFlow.
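The latency/throughput distinction is easy to make concrete. The sketch below uses a toy NumPy "model" rather than TensorFlow so it stands alone; the `TF_ENABLE_ONEDNN_OPTS` environment variable is TensorFlow's documented toggle for the oneDNN optimizations (on by default in recent x86 builds) and must be set before TensorFlow is imported. The layer sizes and iteration counts are arbitrary.

```python
import os
import time
import numpy as np

# Documented TensorFlow switch for the oneDNN optimizations; it has no
# effect here (we use NumPy below) but must be set before `import tensorflow`.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

WEIGHTS = np.random.rand(512, 512).astype(np.float32)

def model(batch):
    """Stand-in for a network's forward pass (one dense layer + ReLU)."""
    return np.maximum(batch @ WEIGHTS, 0.0)

def measure(batch_size, n_iters=50):
    batch = np.random.rand(batch_size, 512).astype(np.float32)
    start = time.perf_counter()
    for _ in range(n_iters):
        model(batch)
    elapsed = time.perf_counter() - start
    latency_ms = 1000.0 * elapsed / n_iters      # time per request
    throughput = batch_size * n_iters / elapsed  # samples per second
    return latency_ms, throughput

lat1, thr1 = measure(batch_size=1)     # latency-oriented serving
lat32, thr32 = measure(batch_size=32)  # throughput-oriented batching
```

Batching trades latency for throughput: each request waits longer, but the hardware processes far more samples per second, which is why serving stacks expose batch size as a tuning knob.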
Look-ups are not (yet) all you need for deep learning inference
McCarter, Calvin, Dronen, Nicholas
Fast approximations to matrix multiplication have the potential to dramatically reduce the cost of neural network inference. Recent work on approximate matrix multiplication proposed to replace costly multiplications with table-lookups by fitting a fast hash function from training data. In this work, we propose improvements to this previous work, targeted to the deep learning inference setting, where one has access to both training data and fixed (already learned) model weight matrices. We further propose a fine-tuning procedure for accelerating entire neural networks while minimizing loss in accuracy. Finally, we analyze the proposed method on a simple image classification task. While we show improvements to prior work, overall classification accuracy remains substantially diminished compared to exact matrix multiplication. Our work, despite this negative result, points the way towards future efforts to accelerate inner products with fast nonlinear hashing methods.
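To make the table-lookup idea concrete, here is a minimal product-quantization-style sketch of my own (not the paper's method, which fits a fast hash function rather than nearest-centroid codes): split the input columns into subspaces, learn centroids per subspace from training data, precompute each centroid's partial product with the fixed weight matrix, and replace the inner products at inference time with an encode-then-lookup-then-sum. All sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, n_iters=25):
    """Minimal k-means (illustrative; real systems use an optimized library)."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iters):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centroids[j] = pts.mean(0)
    return centroids

def fit_pq_matmul(X_train, W, n_codebooks=8, k=16):
    """Learn per-subspace centroids and precompute centroid @ W tables."""
    sub = X_train.shape[1] // n_codebooks
    centroids, tables = [], []
    for c in range(n_codebooks):
        cols = slice(c * sub, (c + 1) * sub)
        C = kmeans(X_train[:, cols], k)
        centroids.append(C)
        tables.append(C @ W[cols])  # k x n_out partial products
    return centroids, tables

def pq_matmul(X, centroids, tables):
    """Approximate X @ W with nearest-centroid encoding plus table look-ups."""
    sub = centroids[0].shape[1]
    out = np.zeros((len(X), tables[0].shape[1]))
    for c, (C, T) in enumerate(zip(centroids, tables)):
        cols = slice(c * sub, (c + 1) * sub)
        d = ((X[:, cols][:, None, :] - C[None, :, :]) ** 2).sum(-1)
        out += T[d.argmin(1)]  # no multiplications with W at inference time
    return out

X_train = rng.normal(size=(200, 32))
W = rng.normal(size=(32, 4))
centroids, tables = fit_pq_matmul(X_train, W)
approx = pq_matmul(X_train, centroids, tables)
exact = X_train @ W
```

The quantization error this introduces at every layer is exactly the accuracy gap the abstract reports: the look-ups are cheap, but the approximation compounds through a deep network.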
New Electronics - Breakthrough deep learning performance on a CPU
Deci's proprietary Automated Neural Architecture Construction (AutoNAC) technology automatically generated new image classification models that outperform all published models, delivering more than a 2x improvement in runtime coupled with improved accuracy compared to the most powerful publicly available models, such as Google's EfficientNets. While GPUs have traditionally been used for convolutional neural networks (CNNs), CPUs are a much cheaper alternative. Although it is possible to run deep learning inference on CPUs, they are significantly less powerful than GPUs and, as a result, deep learning models typically perform 3-10X slower on a CPU than on a GPU. DeciNets significantly close that performance gap, so tasks that previously could not be carried out on a CPU because they were too resource-intensive are now possible. Additionally, these tasks will see a marked performance improvement.
DeepLINK: Deep learning inference using knockoffs with applications to genomics
Although practically attractive with high prediction and classification power, complicated learning methods often lack interpretability and reproducibility, limiting their scientific usage. A useful remedy is to select truly important variables contributing to the response of interest. We develop a method for deep learning inference using knockoffs, DeepLINK, to achieve the goal of variable selection with controlled error rate in deep learning models. We show that DeepLINK can also have high power in variable selection with a broad class of model designs. We then apply DeepLINK to three real datasets and produce statistical inference results with both reproducibility and biological meanings, demonstrating its promising usage to a broad range of scientific applications. Software data have been deposited in GitHub (). Preprocessed data matrices for the four publicly available data sets can be downloaded with the corresponding link: Zeller microbiome (ref. 67), Yu microbiome (ref. 68), murine scRNA-seq (ref. 69), and human scRNA-seq (ref. 70).
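The error-rate control in knockoff methods comes from a simple selection rule that is worth seeing in isolation. The sketch below (standard knockoff+ filtering, not DeepLINK's specific architecture) assumes you already have knockoff statistics W_j = importance(X_j) - importance(knockoff of X_j) from some fitted model; large positive W_j is evidence a feature is truly relevant, and the threshold caps the estimated false discovery proportion at a target level q. The toy statistics are invented for illustration.

```python
import numpy as np

def knockoff_threshold(W, q=0.1):
    """Knockoff+ threshold: the smallest t such that
    (1 + #{j : W_j <= -t}) / #{j : W_j >= t} <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_estimate = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_estimate <= q:
            return t
    return np.inf  # nothing can be selected at this q

# Toy statistics: 5 strong signals, 15 nulls roughly symmetric about zero
rng = np.random.default_rng(0)
W = np.concatenate([rng.uniform(2.0, 3.0, 5), rng.normal(0, 0.3, 15)])
t = knockoff_threshold(W, q=0.2)
selected = np.flatnonzero(W >= t)
```

The symmetry of null W_j around zero is what makes the numerator a valid estimate of false discoveries; DeepLINK's contribution is making this machinery work when the importances come from a deep network.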
Acceleration in Innovation! The Latest Breakthroughs in Conversational AI, Computer Vision and Recommender Systems with NVIDIA
It is a dynamic, threshold-breaking time for innovation across Conversational Artificial Intelligence, Computer Vision and Recommender Systems (RecSys), with NVIDIA breaking new ground with the launch of TensorRT 8, alongside multiple RecSys competition successes - more on this news in depth shortly! But first, let's set the scene on exactly why this matters so much today. As we move into an Era of Convergence blending algorithms, engineering and culture alike, and reflecting both the level of integration and the increased pace of socio-technical change, it becomes imperative to manage the demands this inevitably creates in order to optimise the vast opportunities. Deep learning is a case in point - applicable to a diverse and growing range of industries, from medical devices through to conversational IVR and automated driving, and across a wide range of applications in production, including image and video analysis, natural language processing (NLP) and recommender systems. But as the number of applications increases, so do the demands!
Deci and Intel look to optimise deep learning inference
The deep learning company, Deci, has announced a broad strategic business and technology collaboration with Intel to optimise deep learning inference on Intel Architecture (IA) CPUs. As one of the first companies to participate in the Intel Ignite startup accelerator, Deci will now work with Intel to deploy innovative AI technologies to mutual customers. The collaboration is intended to take a significant step towards enabling deep learning inference at scale on Intel CPUs, reducing costs and latency, and enabling new applications of deep learning inference. New deep learning tasks can be performed in a real-time environment on edge devices and companies that use large scale inference scenarios can dramatically cut cloud or datacentre cost, simply by changing the inference hardware from GPU to Intel CPU. "By optimising the AI models that run on Intel's hardware, Deci enables customers to get even more speed and will allow for cost-effective and more general deep learning use cases on Intel CPUs," said Deci CEO and co-founder Yonatan Geifman.
Intel hooks up with Deci for deep learning
As one of the first companies to participate in the Intel Ignite startup accelerator, Deci will now work with Intel to deploy innovative AI technologies to mutual customers. The collaboration helps enable deep learning inference at scale on Intel CPUs, reducing costs and latency, and enabling new applications of deep learning inference. New deep learning tasks can be performed in a real-time environment on edge devices, and companies that use large-scale inference scenarios can dramatically cut cloud or datacenter cost simply by changing the inference hardware from GPU to Intel CPU. "By optimizing the AI models that run on Intel's hardware, Deci enables customers to get even more speed and will allow for cost-effective and more general deep learning use cases on Intel CPUs," says Deci CEO and co-founder Yonatan Geifman. Deci and Intel's collaboration began with MLPerf, where on several Intel CPUs, Deci's AutoNAC (Automated Neural Architecture Construction) technology accelerated the inference speed of the well-known ResNet-50 neural network, reducing the submitted models' latency by a factor of up to 11.8x and increasing throughput by up to 11x.
Deep Learning Inference with Azure ML Studio
In this project-based course, you will use the Multiclass Neural Network module in Azure Machine Learning Studio to train a neural network to recognize handwritten digits. Microsoft Azure Machine Learning Studio is a drag-and-drop tool you can use to rapidly build and deploy machine learning models on Azure. The data used in this course is the popular MNIST data set consisting of 70,000 grayscale images of hand-written digits. You are going to deploy the trained neural network model as an Azure Web service. Azure Web Services provide an interface between an application and a Machine Learning Studio workflow scoring model.
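Once the model is deployed, an application calls the web service over HTTP. The sketch below shows the general shape of such a call; the endpoint URL, API key, input name `input1`, and `pixel0..pixel783` column names are placeholders I've assumed for illustration - the real values come from the web service's dashboard in Machine Learning Studio after deployment.

```python
import json
import urllib.request

# Placeholders: copy the real endpoint URL and API key from the deployed
# web service's dashboard in Machine Learning Studio.
URL = "https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/execute?api-version=2.0"
API_KEY = "<your-api-key>"

def build_request_body(pixels):
    """Wrap one flattened 28x28 MNIST image (784 grayscale values) in a
    request-response JSON body; input/column names are assumed here."""
    return {
        "Inputs": {
            "input1": {
                "ColumnNames": [f"pixel{i}" for i in range(784)],
                "Values": [[str(v) for v in pixels]],
            }
        },
        "GlobalParameters": {},
    }

def score_digit(pixels):
    """POST the image to the scoring endpoint and return the parsed reply."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(build_request_body(pixels)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

body = build_request_body([0] * 784)
```

This request/response pattern is the "interface between an application and a Machine Learning Studio workflow scoring model" the course description refers to.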
Deep Learning Inference at Scale
Dashcams are an essential tool in a trucking fleet, both for the truck drivers and the fleet managers. Video footage can exonerate drivers in accidents, as well as provide opportunities for fleet managers to coach drivers. However, with a continuously running camera, there is simply too much footage to examine. When a KeepTruckin dashcam is paired with one of our Vehicle Gateways, the camera automatically uploads only the footage immediately preceding a driver performance event (DPE), an anomalous and potentially dangerous driver-initiated event. Even so, with the volume of video uploaded each day, fleet managers need to sift through the incoming data so that they can direct their attention to the most important videos for safety analysis. And for the videos selected for viewing, they need video overlays to more easily understand what happened in them.